Data Hash and Public Data
Data Structure
The hash is a SHA256 digest of the key data for a given transaction, serialized as JSON, and covers most of the data in the database. The plain text JSON is publicly available, so any user can recreate the hash and verify it independently.
The input plain text for a given log entry consists of 5 data groupings:
Event Details: what happened when (e.g. item added, locked, withdrawn, etc.)
Parcel Details: characteristics and unique IDs of the item (e.g. 1,000 gram JM gold bar, serial number: JM3533642)
Parcel Usage: how the parcel is used by service providers
Image Hash: hash of the photo taken with the RFID scanner by vault personnel
Prior Event Hash: this links one event to the prior one, creating a HashChain. Any alteration to a past event will therefore invalidate all subsequent event hashes when hashes are verified.
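The chaining described above can be sketched in a few lines of Python. This is a minimal illustration, not GramChain's actual implementation: the JSON canonicalization (sorted keys, compact separators), the empty prior hash for the first event, and the helper names `event_hash`, `rebuild_chain` and `verify_chain` are all assumptions made for the example.

```python
import hashlib
import json


def event_hash(event: dict) -> str:
    """SHA-256 hex digest of the event's JSON text (assumed canonical form)."""
    plain = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(plain.encode("utf-8")).hexdigest()


def rebuild_chain(events: list) -> list:
    """Recompute every hash, feeding each result into the next event."""
    hashes = []
    prior = ""  # assumption: the first event has an empty prior hash
    for event in events:
        linked = {**event, "priorloghash": prior}
        prior = event_hash(linked)
        hashes.append(prior)
    return hashes


def verify_chain(events: list, published: list) -> bool:
    """True only if the recomputed chain matches the published hashes."""
    return rebuild_chain(events) == published
```

Because each hash is an input to the next one, changing a single field in an early event changes every hash that `rebuild_chain` returns from that point on, so `verify_chain` fails against the published values.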
Blockchain Immutability & Costs
The above hash is written to a public blockchain to provide immutable proof of event posting date and content. Public blockchains are a much stronger data guarantee than equivalent proprietary implementations which are ultimately controlled by the system’s administrator(s).
However, public blockchains are slower and require transaction fees. Each SHA256 hash requires 256 bits (32 bytes) per posted record.
Hash Data Details
A description for each field that is part of the data hash and therefore made public on the blockchain:
Key Fields | Description |
---|---|
rfid | The tamper evident metal compatible RFID Code that was applied by the vault operator for a given parcel. This is the primary tracking ID. |
uniqueserial | A composite name made up of parcel characteristics which make it unique. This can be used as a secondary ID and is more meaningful than an RFID. It is made up as follows: MetalCode + Weight + Type + BrandCode + "-" + SerialNR. Should the parcel be a tamper evident bag containing multiple items, the following naming is applied: MetalCode + Quantity + ‘x’ + Weight + Type + BrandCode + "-" + SerialNR. The system does not support mixed items in tamper evident bags. This field can also be displayed in “full text”, whereby Metal, Type and Brand are mapped to their full names (e.g. “AU” would be shown as “Gold” in English). |
priorloghash | The hash of the prior log entry. This is used as input to the current hash. Hence any missing or modified log records for a given RFID would become obvious when a data hash check is performed. |
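The uniqueserial naming rule can be illustrated with a short Python sketch. The function name `unique_serial` and the sample values (including the way the weight is written) are hypothetical; only the order of the parts comes from the description above.

```python
def unique_serial(metal: str, weight: str, item_type: str, brand: str,
                  serial: str, quantity: int = 1) -> str:
    """Compose a uniqueserial from its parts, in the documented order.

    Single item:  MetalCode + Weight + Type + BrandCode + "-" + SerialNR
    Bag of items: MetalCode + Quantity + "x" + Weight + Type + BrandCode + "-" + SerialNR
    """
    # The quantity prefix is only added for multi-item tamper evident bags.
    count = f"{quantity}x" if quantity > 1 else ""
    return f"{metal}{count}{weight}{item_type}{brand}-{serial}"


# Hypothetical values, for illustration only.
single = unique_serial("AU", "1000g", "Bar", "JM", "JM3533642")
bag = unique_serial("AU", "100g", "Bar", "JM", "BAG0001", quantity=5)
```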
Event Details | Description |
---|---|
eventdate | This is the UTC time that the entry was submitted to the GramChain API. |
eventtypecode | The purpose of the entry. The type of events depend on whether a storage provider with a scanner submitted the entry or whether a service provider submitted an entry. This field can also be shown in a full text format for a given language with a description. |
InitiatorID | This is the storage or service provider that initiated the event. For example, a vault scan would be initiated by vault personnel performing an item scan or an external auditor performing an audit event. |
devicehash | This is the device used by the initiator to submit the event. For a vault, this would be the MAC address of the scanner assigned to the vault. The devicehash is recorded for tracking and auditing purposes. For security purposes the data is hashed and salted. By recording this data, it is possible to prove which device logged a given event. For the general public this entry is meaningless and can be ignored. |
operatorhash | This is the login used by the initiating representative to submit the event. The operatorhash is also recorded for tracking and auditing purposes. For security purposes the data is also hashed and salted. By recording this data, it is possible to prove which login submitted a given event. For the general public this entry is meaningless and can be ignored. |
Parcel Details | Description |
---|---|
storageproviderid | The vault identifier where the parcel was scanned. This field can also be shown in a full text format for a given language to show the vault operator full name, address and contact information. |
itemquantity | The number of items in the parcel. In case of a single item this field would be one. Should a tamper evident container be used to group multiple items then this field would record the number of items. For example, a tamper evident bullion bag containing 5 bars would have a value of 5. The items must be of the same type as the GramChain does not support mixed items in tamper evident bags. |
grossgrams | The gross stated mass (weight) of each item as shown in the photo, converted to grams. For example, a bullion bar stating 100 troy ounces would result in 100 x 31.1035 = 3110.35 grams. Grossgrams and pure grams use three decimal accuracy. Conversions between troy ounces and grams use four decimal accuracy and troy ounces are shown with four decimals. |
purity | Minimum purity of the primary metal as stated on each item and as shown in the photo. |
puregrams | Gross grams multiplied by purity to determine the pure grams of the primary metal. In case of alloys only the amount of pure content, in grams, of the primary metal is recorded. Secondary metals are not recorded. |
metalcode | The chemical symbol of the metal. Gold for example would be Au. This field can also be shown in a full text format (e.g. gold) for a given language. |
brandcode | The acronym or short code of well-known manufacturers. The brandcode for the Royal Canadian Mint for example would be RCM. This field can also be shown in a full text format (e.g. Full Name) for a given language. |
itemtypecode | The code, in English, for the type of parcel. Sample values are “Bar”, “Coin”, “BulkBag”, “Barrel”. This field can also be shown in a full text format for a given language. |
serialnr | The serial number on the item. In case of multiple items inside a parcel the serialnr is the serial number of the parcel. Should the bar contain a refinery code this will be added. |
imagehash | The non-salted hash of the parcel photo that was taken by the vault. The parcel photo is saved as a file whose name is the hash, allowing for easy retrieval and verification that the image matches the hash entry. The image hash can be verified by serializing the image into Base64 and hashing data using SHA256. |
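The grossgrams and puregrams arithmetic can be worked through with Python's `decimal` module. This is a sketch under stated assumptions: the rounding mode (half up), purity expressed as a fraction such as 0.9999, and the function names are not specified in the field descriptions and are chosen here for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

# Four-decimal conversion factor taken from the grossgrams description.
GRAMS_PER_TROY_OUNCE = Decimal("31.1035")
THREE_DECIMALS = Decimal("0.001")


def gross_grams(troy_ounces: str) -> Decimal:
    """Convert a stated troy-ounce mass to grams, kept to three decimals."""
    grams = Decimal(troy_ounces) * GRAMS_PER_TROY_OUNCE
    return grams.quantize(THREE_DECIMALS, rounding=ROUND_HALF_UP)  # assumed rounding mode


def pure_grams(gross: Decimal, purity: str) -> Decimal:
    """Gross grams multiplied by purity (assumed to be a fraction, e.g. 0.9999)."""
    return (gross * Decimal(purity)).quantize(THREE_DECIMALS, rounding=ROUND_HALF_UP)


bar_gross = gross_grams("100")            # 100 troy ounces -> 3110.350 grams
bar_pure = pure_grams(bar_gross, "0.9999")
```

Using `Decimal` rather than floating point keeps the stated decimal accuracy exact, which matters when the resulting figures are part of hashed data.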
Parcel Usage | Description |
---|---|
serviceproviderid | The company identifier of the service provider that has a claim on the parcel. This field can also be shown in a full text format for a given language to show the service provider full name, address and contact information. Storage providers record what and where the parcel is. This is done by vault personnel who do not have access to ownership or other legal details. Service providers are the entities that manage ownership using accounthash, encumbrancestatus, lockstatuscode, and lockdetailhash. |
accounthash | This is an ID that the service provider uses to identify the owner or claimant of a given parcel. The source field is up to 50 characters long and is hashed and salted with a provider specific salt. A service provider can let a client know their hash to enable them to check the correct bars are assigned to the client. In essence the accounthash is a strong anonymous account ID. |
grossgrams | The gross stated mass (weight) of each item as shown in the photo, converted to grams. For example, a bullion bar stating 100 troy ounces would result in 100 x 31.1035 = 3110.35 grams. Conversions between troy ounces and grams use four decimal accuracy. |
lockstatuscode | Determines whether the parcel is encumbered and the type of encumbrance. A parcel, for example could be used as collateral for a loan paid out to the accounthash client. |
lockdetailhash | The supporting documentation for the lock in the form of a hash of the supporting document scans. In case of a loan the loan contract with signature (if required) and any other supporting documentation would be combined into a single file. This file is then serialized as Base64 and hashed using SHA256. The file name will be part of the hashed data. The documents will be retained by the service provider. |
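Both imagehash and lockdetailhash are described as a file serialized to Base64 and then hashed with SHA256. A minimal Python sketch of that two-step procedure follows. The function names are hypothetical, and the way the file name is combined into the lockdetailhash input is not specified in the description, so it is left out here.

```python
import base64
import hashlib


def base64_sha256(data: bytes) -> str:
    """Serialize raw bytes to Base64, then return the SHA-256 hex digest."""
    encoded = base64.b64encode(data)
    return hashlib.sha256(encoded).hexdigest()


def file_hash(path: str) -> str:
    """Hash a photo or document file with the Base64-then-SHA256 procedure."""
    with open(path, "rb") as handle:
        return base64_sha256(handle.read())
```

To check a parcel photo, a user would run `file_hash` on the downloaded image and compare the result with the imagehash value, which per the description also serves as the stored file name.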
Hash Broadcasting & Confirmation Recording
Writing hashes to the Ethereum blockchain is a two-step process consisting of broadcasting the hash, which returns a blockchain generated TrxID, and later confirming that the hash has been permanently written by checking, using the TrxID, that the hash has at least 5 confirmations.
Due to congestion and unforeseen errors it can take a variable amount of time until the 5-confirmation threshold is reached. It could be a few minutes or hours, depending also on the amount of congestion and gas paid. There is also a chance that the broadcast is refused due to insufficient gas or some other unknown reason.
GramChain therefore uses a set of processes to handle such eventualities to ensure all hashes are broadcasted and confirmed within a reasonable amount of time. Described below is how GramChain keeps track of hash status, how it obtains TransactionIDs (TrxIDs) and how it confirms them.
Status Recording
To keep track of Hash Broadcast Status the HashBroadcasts table has 5 fields that will be progressively updated as statuses change. Because Hashes are created and broadcasted before being written to the database these hashes start as broadcasted hashes, having both the Hash and BlockchainTrxID fields set. If a TrxID cannot be obtained the HashBroadcasts entry will not be inserted into the DB since no broadcast happened (case 1 /2 below). See “Hash Broadcasting Optimizations” for re-submit of Hashes.
Once a hash is confirmed (case 3) the TrxIDConfirmed field will be set to the UTC datetime at which GramChain confirmed that a minimum of 5 confirmations were made on the Ethereum Blockchain. Otherwise TrxIDConfirmed will remain null. Filtering by TrxIDConfirmed = null will return unconfirmed transactions.
The Status field has additional information regarding a hash, including a small error status indication. In case of errors refer to table ErrorLog for detailed error messages for a given hash/TrxID.
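The status fields can be pictured as a small record type. The sketch below models only the four fields named in this section (Hash, BlockchainTrxID, TrxIDConfirmed, Status); the Python attribute names, the class name and the `unconfirmed` helper are assumptions for illustration and do not reflect the actual table definition.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class HashBroadcast:
    hash: str                                     # Hash
    blockchain_trx_id: str                        # BlockchainTrxID, set at insert time
    trx_id_confirmed: Optional[datetime] = None   # TrxIDConfirmed, null until confirmed
    status: str = ""                              # Status, e.g. confirmation count or error code


def unconfirmed(rows: List[HashBroadcast]) -> List[HashBroadcast]:
    """Equivalent of filtering by TrxIDConfirmed = null."""
    return [row for row in rows if row.trx_id_confirmed is None]


def mark_confirmed(row: HashBroadcast, confirmations: int) -> None:
    """Record the UTC confirmation time and the confirmation count."""
    row.trx_id_confirmed = datetime.now(timezone.utc)
    row.status = str(confirmations)
```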
Broadcast Processes
When a new event is recorded via one of the Insert APIs in the secured API interface the submitted data is hashed and broadcasted before it is written to the GramChain database. Each Broadcast can be for one or more hashes and occurs via method GetTrxID which in turn calls the blockchain API Interface.
The API interface location is stored in the DB in the GlobalSettings table and can be set to different values based on an Entity ID – which is passed by default - to allow the use of different blockchains. The relevant GlobalSettings names for hashbroadcasts are:
- HashAPI_Client : default ‘http://gramchain-hashes-wrapper.littlebit.sg/’
- HashAPI_RequestBroadcast : default ‘api/addhashes’
When the API is called, the hash(es) are broadcasted and the TransactionID (TrxID) for the broadcast is returned. The TrxID is then written into the DB along with the hash and the plain text data making up the hash. To re-broadcast hashes that errored out for some reason, use API secured/Hash_UnsentBroadcastHashes_Check. This procedure should be run periodically to re-broadcast any missing hashes.
Confirmation Processes
Although a hash might have been broadcasted GramChain still requires 5 confirmations to declare the hash as immutably notarized. There are two separate processes to confirm a TrxID hash.
- GramChain API “secured/HashConfirmation”. This API will be called back automatically by the blockchain API used in the earlier broadcast process once 5 confirmations are reached.
- GramChain API “secured/Hash_UnconfirmedTrxID_Check”. When this API is called it will retrieve all unconfirmed transactions in the HashBroadcasts table that are older than 15 minutes and check the status of each one.
The vast majority of transactions should be handled automatically via callbacks to API “secured/HashConfirmation” and should normally occur within 15 minutes. Since this is a callback no external API needs to be called.
Should the HashConfirmation fail the Hash_UnconfirmedTrxID_Check will find and separately check the status confirmation of any unconfirmed transactions that are older than 15 minutes. The 15-minute delay is done to give the HashConfirmation time to process it first. The Hash_UnconfirmedTrxID_Check process calls method GetTrxConfirmations for each unconfirmed transaction using the following GlobalSettings names:
- HashAPI_Client : default ‘http://gramchain-hashes-wrapper.littlebit.sg/’
- HashAPI_RequestConfbyTrxID: default ‘api/checktx/’ + TrxID
If the given TrxID has 5 or more confirmations then the Hash would be set to confirmed by setting TrxIDConfirmed to the UTC datetime of confirmation and recording the number of confirmations under status.
Should the api/checktx be unavailable, the TrxID not be found on the blockchain, or some other error occur, the hash would not be confirmed and an error code would be set on the Status field of HashBroadcasts. The system will also record a more detailed error log in table “ErrorLog”. To find errors specific to a given TrxID it is possible to filter the ErrorLog by field ContextID or the more generic ErrorCategory.
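The decision logic of the fallback check can be summarized in Python. This sketch covers only the rules stated in this section (15-minute age filter, 5-confirmation threshold, URL built from the two GlobalSettings values). The function names and the notion of a separate broadcast timestamp are assumptions, and no network call is made because the response format of the blockchain API is not documented here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

REQUIRED_CONFIRMATIONS = 5
MIN_AGE = timedelta(minutes=15)  # gives the HashConfirmation callback time to run first


def check_url(client: str, request_path: str, trx_id: str) -> str:
    """Join HashAPI_Client, HashAPI_RequestConfbyTrxID and the TrxID."""
    return f"{client.rstrip('/')}/{request_path.strip('/')}/{trx_id}"


def needs_check(broadcast_at: datetime, confirmed_at: Optional[datetime],
                now: datetime) -> bool:
    """True for unconfirmed transactions older than 15 minutes."""
    return confirmed_at is None and now - broadcast_at > MIN_AGE


def is_confirmed(confirmations: int) -> bool:
    """A hash counts as confirmed at 5 or more confirmations."""
    return confirmations >= REQUIRED_CONFIRMATIONS
```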
Periodic batch executions
The Hash_UnsentBroadcastHashes_Check and Hash_UnconfirmedTrxID_Check processes (in that order) should be run periodically to re-broadcast and re-confirm unsent and unconfirmed records. If notification emails are required upon failed confirmations, the ErrorLog would be the best trigger location.
Hash Broadcasting Optimizations
Every entry on a public blockchain has a cost. As data volumes grow it is important to manage the costs of embedding this data in public blockchains. One way to do this is to batch hashes together so that multiple hashes can be submitted for a given transaction ID to reduce the per-hash cost. As of early 2019 the table below would show typical hash submission costs on Ethereum.
However, batching implies that data will not be broadcasted when the data is written in the DB but will be broadcasted with the next upcoming broadcast batch. To facilitate broadcasting efficiency the following GlobalSettings (see admin interface) can be used to determine broadcast behavior:
Settings for live transaction:
- MinBlkHashesInTrx (e.g. = 1) determines the minimum hashes that can be broadcast. A setting of X will result in the system waiting for X hashes before broadcasting a batch. A setting of 0 would result in no hashes being broadcast and is intended for testing only.
- MaxBlkHashesInTrx (e.g. = 10) determines the maximum number of hashes that will be included as part of a single TransactionID. If max or min is set to 0 nothing will be broadcast.
- If both MinBlkHashesInTrx and MaxBlkHashesInTrx are set to 1, only the hash of the current insert is processed, without querying additional unpublished hashes from the DB. This implies 1 hash per transaction and is the fastest option; batch jobs can act as a failsafe.
Settings for Batch Jobs which will broadcast any unbroadcasted transactions periodically regardless of MinBlkHashesInTrx and MaxBlkHashesInTrx settings (Call BroadcastHash(entityid, callerid, “”) to initiate batch):
- MinBlkHashesInBatch (e.g. =5) Same as MinBlkHashesInTrx but for batch jobs
- MaxBlkHashesInBatch (e.g. =10) Same as MaxBlkHashesInTrx but for batch jobs
Examples:
If 56 records need to be submitted, the live transaction settings operate as follows:
- If MinBlkHashesInTrx and MaxBlkHashesInTrx are both set to 1 then only the single (current transaction) record will be submitted (the 55 unsubmitted are ignored).
- If set to min 1 max 10 then all missing records will be submitted in 6 transactions (10,10,10,10,10,6)
- If set to min 5 max 10 then all missing records will be submitted in 6 transactions, but the system will wait for at least 5 unsubmitted records before submitting the next broadcast. This can delay broadcasting considerably, since it is not known when further transactions will be submitted.
If 56 records need to be submitted, the Batch settings operate as follows:
- Same as live transactions. However, a fixed time-based job can be set to run.
- Note: if the transaction Min is more than 1 and the batch Min is more than 1 then some transactions may not be published in a timely fashion.
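The batch sizes in the examples above can be reproduced with a small Python function. This is a sketch of the described behavior only, under the assumption that the system fills each transaction up to the maximum and stops once fewer than the minimum remain; the function name `plan_batches` is hypothetical.

```python
from typing import List


def plan_batches(pending: int, min_hashes: int, max_hashes: int) -> List[int]:
    """Return the sizes of the broadcasts that would be sent right now."""
    if min_hashes <= 0 or max_hashes <= 0:
        return []  # a setting of 0 disables broadcasting (testing only)
    if min_hashes == 1 and max_hashes == 1:
        # Only the current insert is broadcast; older unsent hashes are not queried.
        return [1] if pending > 0 else []
    batches = []
    while pending >= min_hashes:
        size = min(pending, max_hashes)
        batches.append(size)
        pending -= size
    return batches  # anything left over waits until the minimum is reached
```

With 56 pending records, `plan_batches(56, 1, 10)` and `plan_batches(56, 5, 10)` both give six transactions of sizes 10, 10, 10, 10, 10 and 6. With 53 pending records and a minimum of 5, the last 3 records stay unsent until more arrive, which is the delay the note above warns about.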
Hash Salts & Privacy
Salts are random strings that are added to unhashed “plain” data strings to prevent third parties from using the public hashes to guess the plain data content. The more sensitive the data, the stronger the salt and the more unique it should be per row.
GramChain, however, publishes low-sensitivity data hashes which should remain constant between log entries to increase transparency. This calls for Service Provider specific salts rather than per-row salts. The salt choices are as follows:
Data | Use | Sensitivity – Hash Salt |
---|---|---|
PasswordHash | Password for login accounts (not part of public blockchain data) | High - 32 char per row Random salt. |
devicehash | Mac Address of Scanner | Medium/Low – 32 char Provider salt |
operatorhash | Login for scanner / service app | Medium/Low – 32 char Provider salt |
accounthash | Client AccountID of Service Provider | Medium – 32 char Provider salt |
lockdetailhash | Client Service details / hash. | None – No Salt as it is a PDF hash |
imagehash | Hash of Image | None – No Salt as it is an image hash. |
priorloghash | Hash of prior log data | None – No Salt. |
The tradeoff between security and transparency for published hashes is to use provider specific salts. The salts make it much more difficult to guess the source text, yet they allow users who know the relevant hash to query the public GramChain to ensure their hash is correctly assigned to the relevant parcels.
Salts are created using the cryptographic RNGCryptoServiceProvider library to ensure enough entropy is present. The 32-character alphanumeric string is then prepended to the plain text value to be hashed. The valid salt values are case-sensitive alphanumeric characters [a-z, A-Z, 0-9].
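The salting scheme can be sketched in Python. Note the substitutions: Python's `secrets` module stands in for the .NET RNGCryptoServiceProvider named above, and the use of SHA256 and UTF-8 encoding for the salted value is an assumption, as are the function names.

```python
import hashlib
import secrets
import string

SALT_ALPHABET = string.ascii_letters + string.digits  # [a-z, A-Z, 0-9]
SALT_LENGTH = 32


def new_salt(length: int = SALT_LENGTH) -> str:
    """Generate a case-sensitive alphanumeric salt from a cryptographic RNG."""
    return "".join(secrets.choice(SALT_ALPHABET) for _ in range(length))


def salted_hash(salt: str, plain: str) -> str:
    """Prepend the salt to the plain text value and hash the result."""
    return hashlib.sha256((salt + plain).encode("utf-8")).hexdigest()


provider_salt = new_salt()
# The same provider salt yields the same hash for the same account ID,
# which keeps accounthash constant between log entries.
first = salted_hash(provider_salt, "ACCOUNT-001")   # hypothetical account ID
second = salted_hash(provider_salt, "ACCOUNT-001")
```

Reusing one salt per provider is what lets a client who has been told their accounthash find every parcel assigned to them, while a per-row salt, as used for PasswordHash, would make each entry's hash different.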